ViBERTgrid: A Jointly Trained Multi-modal 2D Document Representation for Key Information Extraction from Documents

Authors

Abstract

Recent grid-based document representations like BERTgrid allow the simultaneous encoding of the textual and layout information of a document in a 2D feature map, so that state-of-the-art image segmentation and/or object detection models can be straightforwardly leveraged to extract key information from documents. However, such methods have not yet achieved performance comparable to state-of-the-art sequence- and graph-based methods such as LayoutLM and PICK. In this paper, we propose a new multi-modal backbone network by concatenating a BERTgrid to an intermediate layer of a CNN model, where the input of the CNN is a document image and the BERTgrid is a grid of word embeddings, to generate a more powerful grid-based document representation, named ViBERTgrid. Unlike BERTgrid, the parameters of BERT and the CNN in our multi-modal backbone network are trained jointly. Our experimental results demonstrate that this joint training strategy significantly improves the representation ability of ViBERTgrid. Consequently, our ViBERTgrid-based key information extraction approach has achieved state-of-the-art performance on real-world datasets.
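The fusion step mentioned in the abstract (a grid of word embeddings concatenated with an intermediate CNN feature map) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `build_embedding_grid` and `fuse`, the stride of 8, the toy embedding size, and the random stand-in for CNN features are all assumptions made for illustration, and the joint training of BERT and the CNN is not shown.

```python
import numpy as np


def build_embedding_grid(words, height, width, stride, dim):
    """Paint each word's embedding into the grid cells covered by its box.

    words: list of (x0, y0, x1, y1, embedding) in image pixel coordinates.
    Returns an array of shape (dim, height // stride, width // stride).
    Cells not covered by any word stay zero.
    """
    grid = np.zeros((dim, height // stride, width // stride), dtype=np.float32)
    for x0, y0, x1, y1, emb in words:
        # Map the pixel box to grid cells and cover at least one cell.
        gx0, gy0 = x0 // stride, y0 // stride
        gx1 = max(gx0 + 1, -(-x1 // stride))  # ceiling division
        gy1 = max(gy0 + 1, -(-y1 // stride))
        vec = np.asarray(emb, dtype=np.float32)
        grid[:, gy0:gy1, gx0:gx1] = vec[:, None, None]
    return grid


def fuse(cnn_features, embedding_grid):
    """Concatenate CNN features and the embedding grid along the channel axis."""
    if cnn_features.shape[1:] != embedding_grid.shape[1:]:
        raise ValueError("spatial sizes of the two maps must match")
    return np.concatenate([cnn_features, embedding_grid], axis=0)


# Toy example: a 32x64 "image", feature maps at stride 8, 4-dim embeddings.
# In a real system the embeddings would come from a language model such as BERT.
words = [
    (0, 0, 16, 8, [1.0, 0.0, 0.0, 0.0]),     # hypothetical word box 1
    (40, 16, 64, 24, [0.0, 1.0, 0.0, 0.0]),  # hypothetical word box 2
]
grid = build_embedding_grid(words, height=32, width=64, stride=8, dim=4)

# Stand-in for an intermediate CNN feature map with 16 channels at stride 8.
cnn_features = np.random.default_rng(0).standard_normal((16, 4, 8)).astype(np.float32)

fused = fuse(cnn_features, grid)
print(fused.shape)
```

Concatenating along the channel axis keeps the spatial layout intact, so later convolutional layers see visual and textual features for the same page location side by side.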

Similar resources

Information Extraction from Multi-Document Threads

Information extraction (IE) is the task of extracting fragments of important information from natural language documents. Most IE research involves algorithms for learning to exploit regularities inherent in the textual information and language use, and such systems generally assume that each document can be processed in isolation. We are extending IE techniques to multi-document extraction tas...

Information Extraction from HTML Documents Based on Logical Document Structure

The World Wide Web presents the largest Internet source of information from a broad range of areas. The web documents are mostly written in the Hypertext Markup Language (HTML) that doesn’t contain any means for semantic description of the content and thus the contained information cannot be processed directly. Current approaches for the information extraction from HTML are mostly based on wrap...

Multi-document Summarization for Terrorism Information Extraction

Counterterrorism is one of the major challenges to society. In order to fight against terrorists, it is very important to have a thorough understanding of terrorism incidents. However, it is impossible for a human to read all the information related to a terrorism incident because of the large volume of information. Summarization technique is urgently required for analysis of terroris...

Document Representation Methods for Clustering Bilingual Documents

Globalization places people in a multilingual environment. There is a growing number of users to access and share information in several languages for public or private purpose. In order to deliver relevant information in different languages, efficient multilingual documents management is worthy of study. Generally, classification and clustering are two typical methods for documents management....

Extraction of Document Structure for Genomics Documents

We are taking as our foundational assumption that effective information retrieval tasks in the broad domain of biomedical literature must address the singular nature of scholarly communication and the effect this has upon a document corpus. The corpus for a typical TREC task is comprised of a temporal sequence of news documents exhibiting little, if any, internal structure. Reduction of a docum...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-86549-8_35